Abstract. We specify a small set, consisting of $O(d(\log\log d)^2)$ points,
that intersects the basins under Newton's method of all roots of all (suitably
normalized) complex polynomials of fixed degree $d$, with arbitrarily
high probability. This set is an efficient and universal probabilistic
set of starting points to find all roots of polynomials of degree $d$ using
Newton's method; the best known deterministic set of starting points
consists of $\lceil 1.1\,d(\log d)^2\rceil$ points.



1. Introduction 

Newton's root-finding method is as old as analysis, but still not well un- 
derstood, even in the fundamental case of finding all roots of a polynomial 
in a single variable. Its local convergence properties are well known; near 
simple roots convergence is quadratic and thus extremely rapid. However, 
the global dynamical properties are insufficiently understood so that nu- 
merical analysis algorithms often use different global methods, and resort to 
Newton's method for a final local "polishing" of the roots. 

This article is a contribution towards a better understanding of the global 
properties of Newton's method, applied to polynomials in a single complex 
variable. Even for polynomials over the reals, and even if all the roots are 
real, it is often preferable to use complex methods; see Figure 1. 

Among the difficulties with Newton's method are the following: 

• if an orbit under iteration comes close to a critical point of the polynomial,
the Newton map sends the orbit far away, near $\infty$, so that
control of the dynamics is lost, and in any case a large number of
iterations is required until the orbit comes back to where the roots
are;

• there are polynomials with open sets of starting points that do not
converge to any root (Smale [S1] asked, in 1984, for a classification
of such polynomials; an answer has recently been given by Mikulich
in current work [Mi]);

• the boundary of the basins of convergence for the roots may have
positive planar Lebesgue measure (this follows from recent work by
Buff and Chéritat on the existence of Julia sets with positive measure
[BC], combined with Douady and Hubbard's renormalization theory
[DH]);

• even if almost every point in $\mathbb{C}$ converges to some root under the
Newton iteration, our goal is to find all roots of the polynomial,
and with bounded complexity. Finding some roots and deflating
is usually not an option, because deflation is in general numerically
unstable (unless the roots are found in a specific order), and because
deflation might not be compatible with the way the polynomial may
be specified, or evaluated efficiently (for instance, if the polynomial
itself is given by an efficient iteration procedure).

See [Rü] for a recent survey of known results on Newton's method.

Figure 1. Dynamical planes of Newton maps of two complex
polynomials. Different colors illustrate basins of attraction
of different roots; shades of color illustrate different
speeds of convergence. It is clearly visible that all immediate
basins are unbounded and have one or several channels to $\infty$
of different widths. Left: a polynomial of degree 7. Right:
a polynomial of degree 11 with all roots real. Some of the
roots are very close to each other; however, away from the
disk containing all the roots, the basins and their channels
all have almost uniform width, so that finding the real roots
using complex methods is much easier.

This article is a contribution towards the goal of turning Newton's method 
into an efficient algorithm. To achieve this goal, one should: 

• select a finite set $S_d$ of good starting points that are guaranteed to
intersect the basins of all roots;

• specify a condition when to stop iterating any of these starting points,
because the orbit is either sufficiently close to a root, or the orbit is
discarded in favor of some other starting points;

• give a good bound on the complexity of Newton's method to find all
roots of the polynomial with prescribed precision.

This article is concerned with the first of these questions; we will not discuss
the other two issues in detail (see for instance [S2, Rü]). Concerning
efficiency of the Newton method, we mention the following recent result
from [Sch1, Sch2, ABS]: roughly speaking, for "most" polynomials $p$ of
degree $d$, properly normalized, our universal set $S_d$ contains $d$ points that
converge to the $d$ different roots of $p$ so that the total number of Newton
iterations, for all $d$ roots combined, to achieve an accuracy of $\varepsilon$ is at most
$O(d^2 \log^4 d) + d \log|\log \varepsilon|$. This makes it possible to turn Newton's method
into an efficient algorithm for the problem of finding all roots of a given
polynomial.

Figure 2. The dynamical plane of the polynomial $p(z) = z(z^{10} - 1)$:
the ten roots of unity each have one "thick" channel,
while the root $z = 0$ has 10 channels (red) which are all
rather "thin". The deterministic method from [HSS] would
search for the individual thin channels and thus requires more
points, while our method searches for the union of all thin
channels, which together are much bigger.

To state our main result, let $\mathcal{P}_d$ be the space of polynomials of degree $d$,
normalized so that all roots are contained in the complex unit disk $\mathbb{D}$.

Theorem 1 (Small Probabilistic Universal Set of Starting Points).

For every degree $d \ge 3$, there is an explicit universal probabilistic set $S_d$
consisting of $O(d(\log\log d)^2)$ starting points so that for every polynomial
$p \in \mathcal{P}_d$, the probability is greater than 1/2 that the immediate basin of each
root of $p$ contains at least one point in $S_d$ (in fact, this probability is at least
$1 - 1/d \ge 2/3$).

Remark 1. The meaning of an "explicit and universal" probabilistic set is as
follows: we give an explicit probability distribution of starting points that
depends only on $d$ so that for any $p \in \mathcal{P}_d$, with probability at least $1 - 1/d$
all immediate basins contain at least one point in this set. (The probability
$1 - 1/d$ may seem somewhat artificial; it is what we get naturally from our
estimates, and it is better than the uniform 2/3.) Of course, enlarging this
set of points appropriately, the probability of success can be increased (see
Remark 7): for every probability $p \in (0, 1)$, there is an explicit and universal
set $S_{d,p}$ of starting points with cardinality $O(d(\log\log d)^2 + d\,|\log(1 - p)|)$
such that the statement of Theorem 1 is true with probability $p$ instead of
$1 - 1/d$.



This result is in a similar spirit as [HSS], where a similar explicit universal
set of starting points is constructed. It consists of $\lceil 1.1\,d(\log d)^2\rceil$ points and
is deterministic. Our new set is significantly smaller than the deterministic
set, much closer to the "ideal lower bound" of $d$ points, but we can do so only
using a probabilistic set. We believe that there is no deterministic explicit
and universal set of starting points with $o(d \log d)$ points.

Construction of the set $S_d$. Our set $S_d$ is constructed as follows: firstly,
we define a "fundamental annulus"

$$V := \{z \in \mathbb{C} : R\,((d - 1)/d)^{1/2} < |z| < R\}$$

for some $R > 1 + \sqrt{2}$, and choose a "deterministic set" of approximately
$(16/\pi)\,d(\log\log d)^2$ points that are distributed on $m = \lceil (2/\pi) \log\log d \rceil$ circles.
These circles have radii $R_k = R\,(1 - 1/d)^{(k - 1/2)/2m}$ for $k = 1, 2, \dots, m$,
and each circle contains $\lceil 4\pi d\,\lceil (2/\pi) \log\log d \rceil \rceil$ points at equal distances.
This construction is in principle the same as in [HSS]. Secondly, we choose
a "probabilistic set" of $\lceil (300/\pi)\,d \log\log d \rceil$ points randomly inside the annulus
$A_R = \{z \in \mathbb{C} : R(d - 1)/d - 1/d < |z| < R\}$ for some $R \ge 11$. These
deterministic and probabilistic sets of points will respectively find "thick"
and "thin" roots, as defined below. Iterating Newton's method starting at
these points (in parallel or in any order), we will find all roots of $p$ with
probability at least $1 - 1/d$ (or with any probability $p \in (0, 1)$ when taking
appropriately more points in the probabilistic set).
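
The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `starting_points`, the concrete radii `R_det = 2.5` (any value above $1 + \sqrt{2}$) and `R_prob = 11`, the indexing of the circles by the core circles of the subannuli of $V$, and the choice of sampling the random points uniformly with respect to area are not fixed by the text.

```python
import cmath
import math
import random


def starting_points(d, R_det=2.5, R_prob=11.0, seed=0):
    """Sketch of the two-part set S_d for degree d >= 3 (hypothetical helper)."""
    loglog = math.log(math.log(d))
    m = math.ceil((2 / math.pi) * loglog)      # number of circles
    n = math.ceil(4 * math.pi * d * m)         # equidistant points per circle

    deterministic = []
    for k in range(1, m + 1):
        # radius of the core circle of the k-th subannulus of V
        r = R_det * (1 - 1 / d) ** ((k - 0.5) / (2 * m))
        deterministic += [r * cmath.exp(2j * math.pi * j / n) for j in range(n)]

    rng = random.Random(seed)
    count = math.ceil((300 / math.pi) * d * loglog)
    r_in = R_prob * (d - 1) / d - 1 / d        # inner radius of A_R
    probabilistic = []
    for _ in range(count):
        # assumption: uniform distribution with respect to area on A_R
        r = math.sqrt(rng.uniform(r_in ** 2, R_prob ** 2))
        probabilistic.append(cmath.rect(r, rng.uniform(0, 2 * math.pi)))
    return deterministic, probabilistic
```

For example, `starting_points(100)` returns one circle of 1257 points together with the random points in the annulus $10.88 < |z| < 11$.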

Historical Remark. This research has its origins at the 50th anniversary 
celebration of the International Mathematical Olympiad (IMO) held in 2009 
in Bremen, Germany. One chief goal of this celebration was to bring together 
olympiad mathematics and research mathematics, and people involved in 
both. This paper was authored by a research mathematician who in his 
youth was one of the first contestants ever at IMOs and in 2009 was a guest 
of honor at the 50th IMO, together with one of the contestants there, and a 
research mathematician who was among the senior organizers of that IMO 
and its anniversary. This work is thus very much in the spirit of the IMO 
anniversary, and we are grateful to this anniversary celebration that has 
brought us together. 



2. Channels and Their Moduli

Consider a complex polynomial $p(z) = c \prod_{j=1}^{d} (z - \alpha_j)$ and let
$N_p(z) = z - p(z)/p'(z)$ be the associated Newton map. This is a rational map of
degree $d$ if all roots of $p$ are distinct, and of lower degree otherwise. Without
changing the Newton map, we may suppose that $c = 1$, and after rescaling,
we may suppose that all $\alpha_j \in \mathbb{D}$.
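
As a small illustration of the Newton map, the following sketch evaluates $N_p$ directly from the roots, using the identity $p'(z)/p(z) = \sum_j 1/(z - \alpha_j)$. The helper names `newton_map` and `iterate` and the stopping rule are assumptions made for this example; in practice the roots are of course unknown and $p$ would be given in some other form.

```python
import cmath


def newton_map(roots):
    """Newton map N_p(z) = z - p(z)/p'(z) for p(z) = prod_j (z - alpha_j)."""
    def N(z):
        # p'(z)/p(z) = sum_j 1/(z - alpha_j), hence N_p(z) = z - 1/sum_j 1/(z - alpha_j)
        s = sum(1 / (z - a) for a in roots)
        return z - 1 / s
    return N


def iterate(N, z, tol=1e-12, max_steps=200):
    """Iterate N from z until consecutive iterates differ by less than tol."""
    for _ in range(max_steps):
        try:
            w = N(z)
        except ZeroDivisionError:
            # z coincides numerically with a root or with a critical point
            return z
        if abs(w - z) < tol:
            return w
        z = w
    return z


# the three cube roots of unity, i.e. p(z) = z^3 - 1
cube_roots = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]
```

Starting on the real axis at $z = 2$, `iterate(newton_map(cube_roots), 2.0)` converges to the root $1$.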

For any root $\alpha$ of $p$, let $U_\alpha$ be the immediate basin of $\alpha$: the basin is the
set of all $z \in \mathbb{C}$ that converge to $\alpha$ under iteration of $N_p$, and the immediate
basin is the connected component containing $\alpha$. It is known that each $U_\alpha$
is simply connected [Pr] and that the restriction of $N_p$ to $U_\alpha$ sends $U_\alpha$ to
itself as a proper map of some degree $k + 1 \in \{2, 3, \dots, d\}$. We will use the
construction and some results from [HSS]. If $\varphi : U_\alpha \to \mathbb{D}$ is a Riemann map
with $\varphi(\alpha) = 0$, then $f := \varphi \circ N_p \circ \varphi^{-1}$ is a proper holomorphic self-map of
$\mathbb{D}$ of degree $k + 1$ and thus extends, by Schwarz reflection, to a rational map
of degree $k + 1$, and the restriction of $f$ to $\partial\mathbb{D}$ is a covering of $\partial\mathbb{D}$, also of
degree $k + 1$. In particular, the restriction of $f$ to $\partial\mathbb{D}$ has $k \ge 1$ fixed points
$q_1, \dots, q_k$. Set $\lambda_i := f'(q_i)$, for $i = 1, 2, \dots, k$.

The holomorphic fixed point formula (which essentially is the residue
theorem for $1/(z - f(z))$; see [M]) implies that

$$\sum_{i=1}^{k} \frac{1}{\lambda_i - 1} \ge 1 \tag{1}$$

(with equality if the root $\alpha$ is simple). Each of these $k$ fixed points gives rise
to a channel to $\infty$ in the immediate basin $U_\alpha$: for our purposes, a channel
is an unbounded component $B_i$ of $U_\alpha \setminus \mathbb{D}$. Near $\infty$, each channel is mapped
by $N_p$ conformally to itself, and it defines an access to $\infty$ within $U_\alpha$ that
is fixed by $N_p$. The quotient of $B_i$ by the dynamics of $N_p$ is a conformal
annulus with modulus $\mu_i = \pi/\log \lambda_i$.

Choose some positive real number $M < \pi/\log 4 \approx 2.266$ that will be
specified later (we will eventually use $M = \pi/\log\log d$ for large $d$).

We call a root $\alpha$ thick if it has a channel with modulus $\mu_i \ge M$, and thin
if there is no such channel. We will treat these two cases separately.

• We will explicitly and deterministically construct a set of $\lceil 4\pi d\,\lceil 2/M \rceil^2 \rceil$
points that is guaranteed to intersect each channel of a root with
modulus at least $M$. This set will thus suffice to "find" all
thick roots.

• The advantage of thin roots is that even though the individual
channels have small moduli, the total area of these channels within
any fundamental domain of the Newton dynamics is greater than
in the thick case: each channel may have little area, but there
are more channels in this case (see Figure 2). We show that if
$\lceil 300\,d \log d/(M e^{\pi/M}) \rceil$ points are distributed randomly in a certain
fundamental annulus of the Newton dynamics, then the probability
that the immediate basins of all thin roots contain such a point is
at least $1 - 1/d$.

Remark 2. If $\alpha$ is a thin root, then all $\mu_i < M$, hence all
$\lambda_i - 1 = e^{\pi/\mu_i} - 1 > e^{\pi/M} - 1$. So by (1), the number $k$ of channels of a thin root is strictly
greater than $e^{\pi/M} - 1$. But the mapping degree of $U_\alpha$ equals $k + 1$, so $U_\alpha$
must contain $k > e^{\pi/M} - 1$ of the at most $2d - 2$ critical points of $N_p$, and
thus the number of thin roots is at most $(2d - 2)/(e^{\pi/M} - 1)$. In the end,
we will use $M = \pi/\log\log d$, so the number of thin roots will be at most
$(2d - 2)/(\log d - 1)$: most roots will be thick. It seems to be an interesting
question (outside the scope of this paper) to estimate how likely it is for a
given polynomial of degree $d$ to have all its roots thick.
If there are thin roots, then we can estimate

$$e^{\pi/M} < k + 1 \le d; \tag{2}$$

in particular, there are no thin roots at all if $M < \pi/\log d$.
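
To get a feeling for the bound in Remark 2, the following sketch evaluates $(2d - 2)/(e^{\pi/M} - 1)$ numerically. The function name `max_thin_roots` is a hypothetical helper introduced only for this illustration.

```python
import math


def max_thin_roots(d, M):
    """Upper bound (2d - 2)/(e^{pi/M} - 1) on the number of thin roots."""
    return (2 * d - 2) / (math.exp(math.pi / M) - 1)


# with M = pi / log log d the bound becomes (2d - 2)/(log d - 1)
d = 1000
M = math.pi / math.log(math.log(d))
bound = max_thin_roots(d, M)
```

For $d = 1000$ this gives a bound of roughly 338 thin roots out of 1000, in line with the statement that most roots are thick.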

A conformal quadrilateral is a Riemann domain $Q \subset \mathbb{C}$ with two distinguished
connected and disjoint subsets of the boundary. In our setting, the
boundary of $Q$ may not be a topological curve, but the two distinguished
boundary subsets will be; we will call them distinguished boundary arcs.
Then there is a unique $h > 0$ so that the domain
$Q_h := \{z \in \mathbb{C} : 0 < \operatorname{Im} z < 1,\ 0 < \operatorname{Re} z < h\}$
has a Riemann map $\varphi : Q \to Q_h$ that maps the
two distinguished boundary arcs onto the two horizontal sides of $Q_h$ (the
Riemann map may not extend continuously to the boundary of $Q$, but it
does so near the two distinguished boundary arcs; the general framework of
extremal length using curve families works even if the boundaries are not
curves). The value $h$ is defined as the conformal modulus of the quadrilateral
$Q$ with respect to the two boundary subsets, and denoted $\operatorname{mod}(Q)$; it
is invariant for conformal homeomorphisms that respect the distinguished
boundary subsets, in particular for Riemann maps with this property [A].

Identifying the two distinguished boundary arcs, we obtain a complex
annulus (a doubly connected Riemann surface) with modulus $\operatorname{mod}(Q)$ or
less (the exact modulus depends on how the boundaries are identified).

3. Hitting thick roots 

In this section, we will construct an explicit and deterministic set of starting
points that is guaranteed to intersect the basins of all thick roots. Our
arguments are essentially the same as in [HSS, Section 5], except that we no
longer need to find all roots, but only the thick ones.

If $R > (d + 1)/(d - 1)$ and $C_R$ is the circle of radius $R$ centered at the
origin, then $N_p$ maps $C_R$ homeomorphically onto some topological circle
around $\mathbb{D}$, and there is some $\kappa > 0$ so that the round annulus

$$V_{R,\kappa} := \{z \in \mathbb{C} : R\,((d - 1)/d)^{\kappa} < |z| < R\}$$

is contained in the topological annulus between $C_R$ and $N_p(C_R)$; specifically,
if $R > 1 + \sqrt{2}$, then $\kappa \ge 1/2$ for all $d$. If $R$ tends to $\infty$, then $\kappa$ tends to 1.
All this is [HSS, Lemmas 4 and 12]; see also Figure 3.



Figure 3. Left: the dynamics of Newton's method for some 
complex polynomial. Highlighted is the immediate basin of 
attraction of one root, with fundamental domains within the 
channels shaded. Also shown is the circle at radius R and 
its image, which is a topological (but not geometric) circle. 
Right: the complex unit disk D provides a conformal model 
for the Newton dynamics of the immediate basin. (Picture 
taken from [HSS].) 

We will use the round annulus $V = V_{R,\kappa}$ with $R > 1 + \sqrt{2}$ and $\kappa = 1/2$
(if we use larger values of $R$, then we can take larger values of $\kappa$, and our
bounds will eventually be slightly better; however, in practice these starting
points would be further away from the roots, and the iteration would take
longer).

Remark 3. The modulus of $V$ is $|\log((d - 1)/d)|/4\pi > 1/(4\pi d)$.

Consider some channel $B_i$. We want to define $Q_i$ as "the part of the
channel $B_i$ within $V$". If each of the two boundary circles of $V$ intersects
$B_i$ in a single connected arc, we set $Q_i := B_i \cap V$. However, if $B_i \setminus V$ has
more than two connected components (see Figure 3), we need to be more
careful. Consider the intersection of $B_i$ with $C_R$, the outer boundary of $V$.
Let $\gamma$ be any connected component in this intersection. It separates $U_\alpha$ into
two components, one of which contains the root $\alpha$; then $\gamma$ will be called an
essential boundary arc of $B_i \cap C_R$ if the component of $U_\alpha \setminus \gamma$ not containing
$\alpha$ is unbounded: this means that $\gamma$ separates the unbounded part of the
channel $B_i$ from the root. At least one component of $B_i \cap C_R$ is essential;
choose one such essential component $\gamma$, let $\gamma' := N_p(\gamma)$, and let $Q'_i$ be the
subset of $U_\alpha$ that is bounded by $\gamma$ and $\gamma'$ (if $B_i$ intersects $C_R$ and equivalently
$N_p(C_R)$ in only one component, then $Q'_i$ is the part of $B_i$ between $C_R$ and
$N_p(C_R)$; in general, the difference may consist of some number of bounded
components). Then $Q'_i$ is a fundamental domain of $B_i$ by the dynamics;
when viewed as a quadrilateral with distinguished boundary arcs $\gamma$ and $\gamma'$,
then $\operatorname{mod}(Q'_i) \ge \operatorname{mod}(B_i) = \mu_i$ ($Q'_i$ is a quadrilateral, the modulus of $B_i$ is
defined using the quotient annulus of $B_i$ by the dynamics).

Now let $C_{R'}$ be the inner boundary circle of $V$ and consider all essential
arcs of intersection of $B_i \cap C_{R'}$. If there is only one, then let $\gamma''$ be this
essential arc. If there are several, then they are totally ordered (because
they all separate $\alpha$ in $U_\alpha$ from the unbounded component of $B_i \setminus V$). Let $\gamma''$
be the outermost component that separates $\alpha$ from $\gamma$ (i.e., the one closest
to $\gamma$), and let $Q_i$ be the component of $B_i \setminus (\gamma \cup \gamma'')$ that is bounded by $\gamma$
and $\gamma''$. This is a conformal quadrilateral with $Q_i \subset Q'_i$, and with $\gamma$ and $\gamma''$
as distinguished boundary arcs, and we have $\operatorname{mod}(Q_i) \ge \operatorname{mod}(Q'_i) \ge \mu_i$.

Our task will be to distribute sufficiently many points into $V$ so that we
hit quadrilaterals $Q_i \subset V$ with moduli bounded below.

Lemma 2. Let $S = \{z \in \mathbb{C} : -1/2 < \operatorname{Re} z < 1/2\}$ and let $Q \subset \mathbb{C}$ be a
quadrilateral whose two distinguished boundary arcs are on the two vertical
sides of $S$, one on each. Suppose that $Q$ is disjoint from the set $i\mathbb{Z}$. Then
the modulus of $Q$ is at most 2.

Proof. This is an easy extremal length exercise [A]. There is an integer
$n \in \mathbb{Z}$ so that any curve in $Q$ connecting the two distinguished boundary
arcs must intersect the segment $[ni, (n + 1)i]$. Without loss of generality,
suppose that $n = 0$.

Let $B := \{z \in S : -1/2 < \operatorname{Im} z < 3/2\}$ and let $\rho$ be the characteristic
function of $B$. Then for any curve $\gamma \subset Q$ connecting the two distinguished
boundary arcs, its intersection with $B$ has length at least 1. Since
$\int_{\mathbb{C}} \rho^2\,dx\,dy = 2$, it follows that $\operatorname{mod}(Q) \le 2$. □

Remark 4. The bound of 2 is not sharp. It is not hard to calculate the exact 
bound [A], but we are not optimizing constant factors here. 

Figure 4. The annulus $V$ (hatched). Its outer boundary
circle is $C_R$; the image $N_p(C_R)$ is a topological circle within
the bounded complementary component of $V$. Also shown is
a channel $B_i$: it intersects $C_R$ in four arcs, three of which are
essential. Shaded is the quadrilateral $Q'_i$ which is bounded
by two essential arcs, one on $C_R$ and one on $N_p(C_R)$; it is
a fundamental domain of $B_i$ modulo $N_p$. The quadrilateral
$Q_i \subset Q'_i$ is shaded darker: it is bounded by two essential arcs
on $\partial V$, but may not be contained in $V$.

Lemma 3. If $V$ is subdivided into at least $2/M$ concentric and conformally
equivalent subannuli, and at least $4\pi d\,\lceil 2/M \rceil$ points are distributed onto the
core circle of each subannulus, so that the points on each circle are equidistributed,
then each quadrilateral $Q_i$ with modulus at least $M$ contains at
least one of these points.

Proof. Let $m := \lceil 2/M \rceil$ and subdivide $V$ into $m$ concentric and conformally
equivalent subannuli $V_1, \dots, V_m$, ordered by decreasing radii (so that
$V_k = \{z \in V : R\beta^k < |z| < R\beta^{k-1}\}$ for $\beta = (1 - 1/d)^{1/2m}$). Write $Q$ for $Q_i$; this
is a quadrilateral for which the two distinguished boundary arcs are on $\partial V$,
one on each boundary component of $V$.

Subdivide $Q$ into quadrilaterals $Q'_1, \dots, Q'_m$ as follows, similarly as above.
The common boundary circle of $V_j$ and $V_{j+1}$ may intersect $Q$ in several
arcs; such an arc is essential if it separates the root $\alpha$ from the unbounded
component of $B_i \setminus V$. Use an essential arc to separate $Q'_j$ from $Q'_{j+1}$, for
$j = 1, 2, \dots, m - 1$. (In the special case that $B_i \cap \partial V_j$ only has two connected
components, then simply $Q'_j = B_i \cap V_j$.)

By the Grötzsch inequality, one of the quadrilaterals $Q'_j$ has modulus
$\operatorname{mod}(Q'_j) \ge m \cdot \operatorname{mod}(Q) \ge \lceil 2/M \rceil M \ge 2$. Supposing for now that $0 \notin Q'_j$
and taking logarithms, the annulus $V_j$ becomes an infinite vertical strip of
width $|\log((d - 1)/d)|/2m > 1/(2md)$, and $Q'_j$ becomes a quadrilateral that
connects the two boundary sides of the strip; see Figure 5.

Figure 5. The annulus $V$ is subdivided into $m = 3$ concentric
subannuli, all of equal moduli. The logarithm unfolds
these annuli to vertical strips (moved apart to show them
separately). Highlighted is the intersection of one channel
with $V$. The quadrilateral in the channel corresponding to
the middle subannulus is shown in a darker shade: notice
that it intersects the other subannuli as well.

By Lemma 2, appropriately rescaled, each quadrilateral of modulus at least 2 intersects
the central vertical line within this strip in a straight line segment of length at
least $1/(2md)$. Therefore, placing an infinite sequence of points on any vertical
line within the strip so that adjacent points have distance less than $1/(2md)$,
one can be sure that at least one of these points lies in the quadrilateral. The
exponential map projects the strip back onto $V_j$ as a universal cover and has
period $2\pi i$, so the required number of points on $V_j$ is $4\pi m d = 4\pi d\,\lceil 2/M \rceil$.

If $Q'_j$ happens to contain the point $z = 0$, then one cannot take the log of
$Q'_j$; but one can take the log of $Q'_j \cap V_j$ and transport the function $\rho$ in the
proof of Lemma 2 into $Q'_j \cap V_j$. This suffices for the conclusion to remain
valid. □

Corollary 4 (Deterministic Starting Points for Thick Roots).

For every $d$ there is an explicit set consisting of
$\lceil 4\pi d\,\lceil 2/M \rceil \rceil \lceil 2/M \rceil \approx 16\pi d/M^2$ points in $V$
so that for each $p \in \mathcal{P}_d$ and each thick root of $p$, at
least one of these points is contained in the immediate basin of this root.

Proof. Using the construction described in Lemma 3, we have $m = \lceil 2/M \rceil$
circles, and each circle contains $\lceil 4\pi d\,\lceil 2/M \rceil \rceil$ points. Hence the total number
of required points is as claimed. These points intersect each quadrilateral
$Q_i$ and thus the immediate basin of each thick root. □
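
The count in Corollary 4 is easy to evaluate. The following sketch (the function name `thick_point_count` is a hypothetical helper) returns the number of circles, the number of points per circle, and their product, which can be compared with the approximation $16\pi d/M^2$.

```python
import math


def thick_point_count(d, M):
    """Circles, points per circle, and total number of deterministic points."""
    m = math.ceil(2 / M)                         # number of circles
    per_circle = math.ceil(4 * math.pi * d * m)  # equidistant points on each circle
    return m, per_circle, m * per_circle
```

For instance, for $d = 100$ and $M = 1$ this gives 2 circles with 2514 points each, so 5028 points in total, compared to $16\pi d/M^2 \approx 5026.5$.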

4. Hitting Thin Roots

Our goal in this case is to find a good lower bound for the area of the
union of all channels of any root, guaranteeing us that we will hit one of
the channels with high probability if we distribute sufficiently many points
randomly on a specified annulus. The area of intersection of a channel with
modulus $\mu_i$ with an annulus will be bounded below by some multiple of $\mu_i$,
so the total area of intersection of an immediate basin with the annulus will
be proportional to $\sum_i \mu_i$, summed over all channels of the root. We thus
start with a lower bound for $\sum_{i=1}^{k} \mu_i$.

Set $a_i = \frac{1}{\lambda_i - 1}$, so that $\sum_{i=1}^{k} a_i \ge 1$. We have

$$\mu_i = \frac{\pi}{\log \lambda_i} = \frac{\pi}{\log(1 + 1/a_i)}.$$

Since $\mu_i < M$ for all $i$, we get that $a_i < 1/(e^{\pi/M} - 1)$ for all $i$.
We want to find a lower bound for

$$\sum_{i=1}^{k} \mu_i = \sum_{i=1}^{k} \frac{\pi}{\log(1 + 1/a_i)}$$

subject to the conditions $\sum_{i=1}^{k} a_i \ge 1$ and $a_i < 1/(e^{\pi/M} - 1)$.

Lemma 5. The function $f : \mathbb{R}^+ \to \mathbb{R}^+$, $f(x) = \pi/\log(1 + 1/x)$, is strictly
monotonically increasing and concave (i.e., its graph is above the line segment
through any two points on it).

Proof. It suffices to prove that $f'$ is positive and monotonically decreasing.
This is a straightforward exercise. □
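
Lemma 5 can be sanity-checked numerically. The sketch below is not a proof; it only tests monotonicity and midpoint concavity of $f$ on a geometric grid (the grid itself is an arbitrary choice made for this example).

```python
import math


def f(x):
    """f(x) = pi / log(1 + 1/x) for x > 0."""
    return math.pi / math.log1p(1 / x)


# geometric grid from 0.01 up to roughly 1.3e3
xs = [0.01 * 1.5 ** j for j in range(30)]

# f is increasing along the grid
increasing = all(f(a) < f(b) for a, b in zip(xs, xs[1:]))

# midpoint concavity: the value at the midpoint lies above the chord
concave = all(f((a + b) / 2) >= (f(a) + f(b)) / 2 for a, b in zip(xs, xs[2:]))
```

Both flags evaluate to `True` on this grid. Note also that $f(1/(e^{\pi/M} - 1)) = M$, which is the identity used in the proof of Lemma 6.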

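Lemma 6. If $\mu_i < M$ for all $i \in \{1, \dots, k\}$, then $\sum_{i=1}^{k} \mu_i > \frac{1}{2} M e^{\pi/M}$.

Proof. Without loss of generality, assume that $a_1 \ge a_2 \ge \dots \ge a_k$, and that
$\sum_{i=1}^{k} a_i = 1$. We now consider the sequence $(b_1, \dots, b_k)$ defined by

$$b_i = \begin{cases}
\dfrac{1}{e^{\pi/M} - 1} & \text{if } 1 \le i \le \lfloor e^{\pi/M} - 1 \rfloor, \\[1ex]
1 - \dfrac{\lfloor e^{\pi/M} - 1 \rfloor}{e^{\pi/M} - 1} & \text{if } i = \lfloor e^{\pi/M} - 1 \rfloor + 1, \\[1ex]
0 & \text{if } i > \lfloor e^{\pi/M} - 1 \rfloor + 1.
\end{cases}$$

Then we also have $\sum_{i=1}^{k} b_i = 1$, and since all $a_i < \frac{1}{e^{\pi/M} - 1}$, it follows that the
sequence $(b_1, b_2, \dots, b_k)$ majorizes the sequence $(a_1, a_2, \dots, a_k)$, in the sense
that

$$\sum_{i=1}^{m} b_i \ge \sum_{i=1}^{m} a_i$$

for all $m \in \{1, 2, \dots, k\}$, with equality for $m = k$. Since the function $f$ is
concave by Lemma 5, we get from Karamata's inequality (see [HLP, Thm.
108]) that $\sum f(a_i) \ge \sum f(b_i)$ and thus

$$\sum_{i=1}^{k} f(a_i) \ge \sum_{i=1}^{\lfloor e^{\pi/M} - 1 \rfloor} f(b_i) = \lfloor e^{\pi/M} - 1 \rfloor \cdot f\left(\frac{1}{e^{\pi/M} - 1}\right) > (e^{\pi/M} - 2)\,M.$$

Since $M < \pi/\log 4$, we have $e^{\pi/M} > 4$ and thus

$$\sum_{i=1}^{k} \mu_i = \sum_{i=1}^{k} f(a_i) > M\,(e^{\pi/M} - 2) > \frac{1}{2} M e^{\pi/M}$$

as claimed. □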
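
The extremal configuration in the proof of Lemma 6 can be evaluated directly. The following sketch computes $\sum f(b_i)$ for the majorizing sequence and compares it with $\frac{1}{2} M e^{\pi/M}$; the function names and the sample values of $M$ are assumptions made for illustration, and $f(0)$ is taken to be $0$ by continuity.

```python
import math


def f(x):
    """f(x) = pi / log(1 + 1/x), extended by f(0) = 0."""
    return math.pi / math.log1p(1 / x) if x > 0 else 0.0


def extremal_sum(M):
    """Sum of f(b_i) over the majorizing sequence b from the proof of Lemma 6."""
    E = math.exp(math.pi / M)
    n = math.floor(E - 1)      # number of entries equal to 1/(E - 1)
    rest = 1 - n / (E - 1)     # the single remaining entry, so that sum(b) = 1
    return n * f(1 / (E - 1)) + f(rest)


# compare with the bound (1/2) * M * e^{pi/M} for a few M below pi/log 4
samples = [0.5, 1.0, 1.5, 2.0, 2.2]
gaps = [extremal_sum(M) - 0.5 * M * math.exp(math.pi / M) for M in samples]
```

All entries of `gaps` are positive, consistent with the lemma; the gap reflects the factor of roughly 2 that is given away in the last step of the proof.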

Let $\psi : (\mathbb{C} \setminus \mathbb{D}) \to \mathbb{C}$ be a linearizing map near $\infty$ of $N_p$, i.e.,
$\psi(N_p(z)) = \psi(z)(d - 1)/d$ with $\psi(\infty) = \infty$, and normalize so that $\psi(z)/z \to 1$ as $z \to \infty$.
Let

$$W_R := \{w \in \mathbb{C} : R(d - 1)/d < |w| < R\}$$

be a fundamental domain in linearizing coordinates.

Lemma 7. For any channel $B_i$, we have

$$|\psi(B_i) \cap W_R| \ge \operatorname{mod}(B_i)\,R^2/d^2.$$

Proof. This is another elementary exercise using extremal length: fix a channel
$B_i$ and let $B := \psi(B_i) \cap W_R$. By conformal invariance, the modulus of
$B_i$ equals the modulus of $B$ where the boundaries are identified by multiplication
by $(d - 1)/d$, and this is

$$(\operatorname{mod} B_i)^{-1} = (\operatorname{mod} B)^{-1} = \sup_{\rho} \inf_{\gamma} \frac{\ell^2(\gamma)}{\|\rho^2\|_B},$$

where $\rho : B \to \mathbb{R}^+$ are measurable functions, $\gamma : [0, 1] \to B$ are smooth curves
with $\gamma(1) = \gamma(0)(d - 1)/d$, and $\ell(\gamma) = \int_0^1 \rho(\gamma(t))\,|\gamma'(t)|\,dt$.

We simply set $\rho = 1|_B$ (the characteristic function of $B$). If $A$ denotes the
Euclidean area of $B$, then $\|\rho^2\|_B = A$. The two boundary circles of $W_R$ have
radii $R$ and $R(d - 1)/d$, so $\ell(\gamma) \ge R/d$. Therefore, $1/\operatorname{mod} B \ge R^2/(d^2 A)$, or
$A \ge \operatorname{mod}(B)\,R^2/d^2 = \operatorname{mod}(B_i)\,R^2/d^2$. □

Lemma 8. For $R > 5$, the intersection of the annulus

$$A_R = \left\{z \in \mathbb{C} : \frac{d - 1}{d} R - \frac{1}{d} < |z| < R\right\}$$

with a channel of modulus $\mu$ has area at least

$$\frac{\mu}{d^2} \cdot \frac{(R - 1)^2 (R - 3)^2}{4 (R + 1)^2}.$$
d2 4(i? + l) 2 ' 

Proof. Consider the circle $C_R := \{z \in \mathbb{C} : |z| = R\}$, and the image $C'_R := N_p(C_R)$.
Then $C'_R$ is another topological circle with absolute values between
$R(d - 1)/d - 1/d = R - (R + 1)/d \ge (R - 1)/2 > 2$ and
$R(d - 1)/d + 1/d = R - (R - 1)/d < R$. Let $Z_R$ be the annulus bounded by $C_R$ and $C'_R$; it is
a fundamental domain for the Newton dynamics, and we have $Z_R \subset A_R$.
Consider a channel $B$ and set $B_R := B \cap Z_R$; this is a fundamental domain
of the channel, but not necessarily connected.

Consider again the linearizing function $\psi : \mathbb{C} \setminus \mathbb{D} \to \mathbb{C}$ of $N_p$, normalized
as $\psi(\infty) = \infty$ and $\psi(z)/z \to 1$ as $z \to \infty$. The Koebe distortion theorem in
this normalization yields

$$\frac{|z| - 1}{|z| + 1} \le \left|\frac{z\,\psi'(z)}{\psi(z)}\right| \le \frac{|z| + 1}{|z| - 1}.$$

Define the sets

$$B_n := \left\{z \in B_R : R \left(\frac{d}{d - 1}\right)^{n-1} < |\psi(z)| \le R \left(\frac{d}{d - 1}\right)^{n}\right\}$$

for $n \in \mathbb{Z}$. Each area element in $B_n$ is mapped into $W_R$ by the map
$z \mapsto \psi(z)((d - 1)/d)^n$ with derivative

$$|\psi'(z)| \left(\frac{d - 1}{d}\right)^{n} \le R\,\frac{|\psi'(z)|}{|\psi(z)|} \le \frac{R}{|z|} \cdot \frac{|z| + 1}{|z| - 1} < \frac{2R}{R - 1} \cdot \frac{R + 1}{R - 3},$$

where we used the Koebe theorem in the second inequality and then $|z| > (R - 1)/2$.
This yields a diffeomorphism from $B_R$ to $\psi(B) \cap W_R$, except for
discontinuities at the finitely many boundary arcs of the $B_n$.

The set $\psi(B)$ intersects $W_R$ in a set of area at least $R^2 \operatorname{mod}(B)/d^2$ by Lemma 7,
and areas in $B_n$ are distorted by a factor of no more than the square of the
derivative. This implies that

$$|B_R| \ge \frac{(R - 1)^2 (R - 3)^2}{4 d^2 (R + 1)^2}\,\operatorname{mod}(B)$$

as claimed. □

Lemma 9. Let $R > 5$ and consider the annulus $A_R$ defined as in Lemma 8.
Choose a probability $p \in (0, 1)$. If

$$16\pi d \cdot \frac{|\log(1 - p)| + \log d}{M e^{\pi/M}} \cdot \frac{R (R + 1)^3}{(R - 1)^2 (R - 3)^2}$$

points are randomly and independently distributed in $A_R$, then for any polynomial
in $\mathcal{P}_d$, each thin root has at least one of these points in its immediate
basin with probability at least $p$.

Proof. The area of all channels within $A_R$ of any fixed thin root is at least
$\frac{(R - 1)^2 (R - 3)^2}{4 d^2 (R + 1)^2} \sum_i \mu_i$ by Lemma 8, and $\sum_i \mu_i > \frac{1}{2} M e^{\pi/M}$
by Lemma 6. A simple calculation shows that the area of $A_R$ is less than
$2\pi R (R + 1)/d$. Therefore, the probability that a point chosen randomly in
$A_R$ will lie in one of the channels of this root is at least

$$q = \frac{M e^{\pi/M}}{16\pi d} \cdot \frac{(R - 1)^2 (R - 3)^2}{R (R + 1)^3}.$$

Now, suppose that we distribute some (large) number $K$ of points on the
annulus $A_R$, randomly and independently. Then the probability that we
do not hit one of the channels of some fixed thin root will be at most
$(1 - q)^K$. Since there are at most $d$ thin roots, the probability that there is
some thin root the channels of which are not hit is at most $d(1 - q)^K$. We
need to make $K$ large enough so that $d(1 - q)^K < 1 - p$, hence we need
$K > \log((1 - p)/d)/\log(1 - q)$.

Since $\log(1 - q) < -q < 0$, we have

$$\frac{\log((1 - p)/d)}{\log(1 - q)} < \frac{\log(1 - p) - \log d}{-q} = \frac{|\log(1 - p)| + \log d}{q} = 16\pi d \cdot \frac{|\log(1 - p)| + \log d}{M e^{\pi/M}} \cdot \frac{R (R + 1)^3}{(R - 1)^2 (R - 3)^2},$$

so it suffices to distribute this number of points within the annulus at random
so that, with probability at least $p$, at least one channel of each thin root is
hit. □
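
The estimate in Lemma 9 can be turned into a small calculator. The sketch below (function names are hypothetical) computes the lower bound $q$ for the hitting probability of a single random point and the resulting number $K$ of random points, and lets one check the inequality $d(1 - q)^K \le 1 - p$ directly.

```python
import math


def hit_probability(d, M, R=11.0):
    """Lower bound q for one random point of A_R to hit a fixed thin root."""
    geom = (R - 1) ** 2 * (R - 3) ** 2 / (R * (R + 1) ** 3)
    return M * math.exp(math.pi / M) * geom / (16 * math.pi * d)


def thin_point_count(d, M, p, R=11.0):
    """Number K of random points with (|log(1 - p)| + log d)/q <= K."""
    q = hit_probability(d, M, R)
    return math.ceil((abs(math.log(1 - p)) + math.log(d)) / q)


# example: d = 1000, M = pi / log log d, success probability p = 1 - 1/d
d = 1000
M = math.pi / math.log(math.log(d))
p = 1 - 1 / d
q = hit_probability(d, M)
K = thin_point_count(d, M, p)
```

With these values, $K$ stays below the simplified bound $300\,d \log d/(M e^{\pi/M})$ used in the proof of Theorem 1.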

Remark 5. Increasing the radius $R$ will decrease the necessary number of
points to asymptotically $16\pi d\,(|\log(1 - p)| + \log d)/(M e^{\pi/M})$ for large $R$. The
disadvantage is that the required number of iterations will be very large
until the roots are reached. In this article, we do not optimize the number
of starting points vs. the number of iterations: indeed, it is possible to
optimize all constants by refining several of our estimates (see below).



5. Conclusion 

Proof of Theorem 1. We have to distribute $16\pi d/M^2$ points within the annulus
$V$ by the algorithm described in Section 3 to be sure that all thick
roots are found. To hit the thin roots, we consider the annulus $A_R$ defined
as in Lemma 8, where we choose $R = 11$ (see Remark 5) so that
$R(R + 1)^3/((R - 1)^2 (R - 3)^2) = 2.97$; in order to find all the thin roots
with probability at least $p = 1 - 1/d$, we thus have to randomly distribute

$$16 \cdot 2.97\,\pi d\,(|\log(1 - p)| + \log d)/(M e^{\pi/M}) < 300\,d \log d/(M e^{\pi/M})$$

points inside the annulus $A_{11}$ (in both statements, we ignored the condition
that we need to round up certain numbers).
This gives us a total of

$$P(M) := \frac{16\pi d}{M^2} + \frac{300\,d \log d}{M e^{\pi/M}}$$

points to be chosen to hit the channels of all roots with probability at least
$1 - 1/d$. In particular, setting $M = \pi/\log\log d$, it suffices to use at least

$$P\left(\frac{\pi}{\log\log d}\right) = \frac{16 d}{\pi} (\log\log d)^2 + \frac{300}{\pi}\,d \log\log d = O\left(d (\log\log d)^2\right)$$

points. □
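
The final count can be evaluated numerically as follows. This sketch (the function name `total_points` is a hypothetical helper) returns the deterministic and probabilistic parts of $P(M)$ at $M = \pi/\log\log d$ separately, so that they can be compared with the closed forms above and with the deterministic bound $1.1\,d(\log d)^2$ from [HSS].

```python
import math


def total_points(d):
    """Deterministic and probabilistic parts of P(M) at M = pi / log log d."""
    M = math.pi / math.log(math.log(d))
    deterministic = 16 * math.pi * d / M ** 2
    probabilistic = 300 * d * math.log(d) / (M * math.exp(math.pi / M))
    return deterministic, probabilistic


def hss_points(d):
    """Size 1.1 * d * (log d)^2 of the deterministic set, without rounding."""
    return 1.1 * d * math.log(d) ** 2
```

With the constants as stated, the advantage is asymptotic: in this sketch the total for $d = 10^9$ is below `hss_points(d)`, whereas for $d = 10^6$ the term $(300/\pi)\,d \log\log d$ still dominates and the total is larger.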

Remark 6. Strictly speaking, this proof only works for $d > e^4 \approx 54.6$, as
we claimed in the beginning that $M < \pi/\log 4$ and finally chose
$M = \pi/\log\log d$. However, we only need this to simplify some term in the proof
of Lemma 6; for $2 < \log d < 4$, by being a little bit more careful in the
proof of Lemma 6 one can even get slightly better constants, whereas for
$1 < \log d < 2$ one has to choose another value for $M$ to get the same final
upper bound.

Remark 7. Of course, the probability $1 - 1/d$ can be replaced by any probability
$p \in (0, 1)$ by appropriately increasing the number of points. For
$M = \pi/\log\log d$, the number of points to find the thin roots then becomes
$O(d \log\log d\,(1 + |\log(1 - p)|/\log d))$. Including thick roots as well, and ignoring
dominated terms, the total number of points becomes

$$O\left(d (\log\log d)^2 + d \log\log d\,|\log(1 - p)|/\log d\right) \le O\left(d (\log\log d)^2 + d\,|\log(1 - p)|\right).$$
This will not even change the leading term of the number of points as long 
as p < 1 - l/rflogrfloglogd. 

Remark 8. At several places, we preferred the simple argument over optimal 
numerical values, as far as constant factors were concerned. If one were to 
optimize these factors, it would involve the following places. The thick 
roots have the higher complexity, so asymptotically it is most important 
to optimize constants here. In Lemma 2, the modulus of a quadrilateral is 
estimated only roughly using a simple argument. The precise value of this
modulus can be determined using elliptic integrals; this has been done
in [HSS] in an analogous situation. One could then optimize the number 
of circles and the number of points on them: taking more (or fewer) circles 
would allow us to use fewer (more) points on each of them, and there is an 
optimal value of circles that minimizes the total number of points. 

For thin roots, we used the estimate $e^{\pi/M} - 2 > e^{\pi/M}/2$ at the end of the
proof of Lemma 6, and for large d this loses a factor of 2. Moreover, in the 
proof of Lemma 9 one could gain a factor of 2 by using a fixed probability 
p, rather than p = 1 — 1/d. Finally, there is a certain loss in the estimation 
of probabilities of hitting the d different basins; these probabilities are not 
quite additive as estimated. Our estimates in the thin case are roughly a 
factor 4 away from being optimal. And of course, one can reduce the radius 
R of the starting points, and thus the required number of iterations, at the 
expense of increasing the number of starting points. 

Remark 9. Since the complexities of the deterministic and the probabilistic
parts are different, it is tempting to reduce the total complexity by choosing
a value of $M$ different from $\pi/\log\log d$ so that both partial complexities
become closer to each other. Slight improvements are indeed possible that
way, but the gain seems to be minimal. For example, one has

$$P\left(\frac{\pi}{(\log\log d)^{1 - 1/(1 + \log\log d)}}\right) = O\left(d (\log\log d)^{2 - 2/(1 + \log\log d)}\right).$$

In this case, the deterministic term is still much bigger than the probabilistic
one. Such calculations seem to become much more complicated with
relatively little gain.

Moreover, we have not used the condition $\sum_{\alpha} k_\alpha \le 2d - 2$ (where $k_\alpha$ is the
number of channels of the root $\alpha$) coming from
the total number of "free" critical points. We believe that the effect of
incorporating this condition will be marginal.